-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Fix x86 stack probing #23881
Fix x86 stack probing #23881
Conversation
On x86, structs are passed by value on the stack. We copy structs to the stack in various ways, but one way is to first subtract the size of the struct and then use a "rep movsb" instruction. If the struct we are passing is sufficiently large, this can cause us to miss the stack guard page. So, introduce stack probes for these struct copies. It turns out the stack pointer after prolog probing can be sitting near the very end of the guard page (one `STACK_ALIGN` slot before the end, which allows a "call" instruction which pushes its return address to touch the guard page with the return address push). We don't want to probe with every argument push, though. So change the prolog probing to insert an "extra" touch at the final SP location if the previous touch was "too far" away, leaving at least some buffer zone for un-probed SP adjustments. I chose this to be the size of the largest SIMD register, which also can get copied to the argument stack with a "SUB;MOV" sequence. Added several test case variations showing different large stack probe situations. Fixes #23796
@dotnet-bot test Windows_NT x64 Checked pmi_asm_diffs |
@dotnet-bot test Windows_NT x86 Checked pmi_asm_diffs |
@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test |
No diffs on x64; no interesting diffs on x86 (I did swap the order of two instructions for constant localloc probes, but the order doesn't affect semantics.) -- for Windows PMI frameworks/benchmarks diffs. |
@briansull PTAL |
@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good
I think we should not merge any new tests as Pri1, at least until we get a simple option to trigger Pri1 tests in a pr. I usually push them as pri0 and change to pri1 before the merge (without waiting the actual testing for the change that only changes the priority). These new tests are failing on many platforms and our weekend testing is all red. Example: outerloop testing for this PR:
|
Fixed: #23996 I thought there already was a "azp ... outerloop" option. (I didn't trigger it, and probably should have.) Unfortunately, I only triggered x86 Pri-1, and the failure was with the new test on 64-bit. |
My bad, it still exists. I thought it was disabled from the jobs that can be triggered with "azp r_u_n" together with jitstress and gcstress jobs, but apparently it was not. |
* Fix x86 stack probing On x86, structs are passed by value on the stack. We copy structs to the stack in various ways, but one way is to first subtract the size of the struct and then use a "rep movsb" instruction. If the struct we are passing is sufficiently large, this can cause us to miss the stack guard page. So, introduce stack probes for these struct copies. It turns out the stack pointer after prolog probing can be sitting near the very end of the guard page (one `STACK_ALIGN` slot before the end, which allows a "call" instruction which pushes its return address to touch the guard page with the return address push). We don't want to probe with every argument push, though. So change the prolog probing to insert an "extra" touch at the final SP location if the previous touch was "too far" away, leaving at least some buffer zone for un-probed SP adjustments. I chose this to be the size of the largest SIMD register, which also can get copied to the argument stack with a "SUB;MOV" sequence. Added several test case variations showing different large stack probe situations. Fixes #23796 * Increase the argument size probe buffer * Formatting
* Fix x86 stack probing On x86, structs are passed by value on the stack. We copy structs to the stack in various ways, but one way is to first subtract the size of the struct and then use a "rep movsb" instruction. If the struct we are passing is sufficiently large, this can cause us to miss the stack guard page. So, introduce stack probes for these struct copies. It turns out the stack pointer after prolog probing can be sitting near the very end of the guard page (one `STACK_ALIGN` slot before the end, which allows a "call" instruction which pushes its return address to touch the guard page with the return address push). We don't want to probe with every argument push, though. So change the prolog probing to insert an "extra" touch at the final SP location if the previous touch was "too far" away, leaving at least some buffer zone for un-probed SP adjustments. I chose this to be the size of the largest SIMD register, which also can get copied to the argument stack with a "SUB;MOV" sequence. Added several test case variations showing different large stack probe situations. Fixes dotnet/coreclr#23796 * Increase the argument size probe buffer * Formatting Commit migrated from dotnet/coreclr@d586523
On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.
So, introduce stack probes for these struct copies.
It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one
STACK_ALIGN
slot beforethe end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.
Added several test case variations showing different large stack
probe situations.
Fixes #23796